Latin Hypercubes: A Class of Multidimensional Declustering Techniques

Authors

  • Bhaskar Himatsingka
  • Jaideep Srivastava
  • Jiang-Zong Li
  • Doron Rotem
Abstract

The I/O subsystem is widely accepted as one of the principal bottlenecks for high-performance parallel database systems. The emergence of parallel I/O architectures has made the problem of data declustering, i.e., fragmenting a file of records and allocating the pieces to different disks, one of prime importance. This is evident from the growing activity in this area. In this study we focus only on multi-attribute declustering methods that are based on some type of grid-based partitioning of the data space. Since the multidimensional range query is the main workhorse for applications accessing such data, the focus is to provide efficient support for it. We first show that there exists no declustering method that is strictly optimal for range queries if the number of disks is greater than 5. Thus the focus is on using declustering methods that provide good average-case performance and are also optimal for a large class of queries. A class of multidimensional declustering methods, called Latin Hypercubes, is proposed. Conditions under which this class is optimal are derived. Also provided are the worst-case and average-case bounds on multidimensional range query performance. A detailed experimental evaluation is carried out to see how the class compares with other declustering methods. The parameters varied are the shape and size of queries, database size, number of attributes, and number of disks. Our findings (theoretical and experimental) show that Latin hypercubes do very well for large queries (near optimal) and partial match queries, and are within reasonable bounds of other declustering methods for small queries. Since it is not possible to have a declustering method that performs optimally for all possible range queries, our findings help decide when to use this class of methods. Finally, since there is no clear winner, parallel database systems must support a number of declustering methods, and Latin Hypercubes would invariably have to be one of them.
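
The abstract does not spell out how a Latin hypercube allocation is built, so the following is only a minimal sketch under stated assumptions. It assumes a grid with as many cells per side as there are disks and uses a modular-sum rule (disk = sum of grid coordinates mod number of disks) as one simple way to obtain a Latin hypercube; this rule and the names allocate, is_latin_hypercube, response_time and optimal_time are illustrative choices, not the paper's construction. The sketch checks the Latin property and compares a range query's response time (the largest number of buckets read from any single disk) with the ideal ceil(|Q| / M).

```python
from collections import Counter
from itertools import product
from math import ceil, prod


def allocate(cell, num_disks):
    """Assumed rule (not from the paper): disk = sum of coordinates mod num_disks."""
    return sum(cell) % num_disks


def is_latin_hypercube(side, dims):
    """True if every axis-parallel line of a side^dims grid uses each of `side` disks once."""
    for axis in range(dims):
        # Fix all coordinates except `axis`, then walk along that axis.
        for rest in product(range(side), repeat=dims - 1):
            disks = {
                allocate(rest[:axis] + (v,) + rest[axis:], side)
                for v in range(side)
            }
            if len(disks) != side:
                return False
    return True


def response_time(ranges, num_disks):
    """Largest number of query buckets on one disk; ranges are inclusive (lo, hi) pairs."""
    cells = product(*(range(lo, hi + 1) for lo, hi in ranges))
    load = Counter(allocate(cell, num_disks) for cell in cells)
    return max(load.values())


def optimal_time(ranges, num_disks):
    """Ideal response time: query size spread evenly over all disks."""
    size = prod(hi - lo + 1 for lo, hi in ranges)
    return ceil(size / num_disks)


if __name__ == "__main__":
    M = 5
    print(is_latin_hypercube(M, 2))             # Latin property on a 5 x 5 grid
    partial_match = [(0, 4), (2, 2)]            # one attribute fixed, one free
    small_square = [(0, 1), (0, 1)]             # 2 x 2 range query
    for q in (partial_match, small_square):
        print(q, response_time(q, M), optimal_time(q, M))
```

With 5 disks on a 5 x 5 grid, the partial match query touches every disk exactly once, while the 2 x 2 query places two buckets on one disk although the ideal is one. This matches the abstract's qualitative claim that the class is strong on partial match and large queries and weaker on small ones.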

Similar resources

A General Construction for Space-filling Latin Hypercubes

Abstract: We propose a general method for constructing Latin hypercubes of flexible run sizes for computer experiments. The method makes use of arrays with a special structure and Latin hypercubes. By using different such arrays and Latin hypercubes, the proposed method produces various types of Latin hypercubes including orthogonal and nearly orthogonal Latin hypercubes, sliced Latin hypercube...

Study of Scalable Declustering Algorithms for Parallel Grid Files

Efficient storage and retrieval of large multidimensional datasets is an important concern for large-scale scientific computations such as long-running time-dependent simulations which periodically generate snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known acce...

Efficient retrieval of multidimensional datasets through parallel I/O

Many scientific and engineering applications process large multidimensional datasets. An important access pattern for these applications is the retrieval of data corresponding to ranges of values in multiple dimensions. Performance is limited by disks largely due to high disk latencies. Tiling and distributing the data across multiple disks is an effective technique for improving performance th...

Latin k-hypercubes

We study k dimensional Latin hypercubes of order n. We describe the automorphism groups of the hypercubes and define the parity of a hypercube and relate the parity with the determinant of a permutation hypercube. We determine the parity in the orbits of the automorphism group. Based on this definition of parity we make a conjecture similar to the Alon-Tarsi conjecture. We define an orthogonali...

Concentric Hyperspaces and Disk Allocation for Fast Parallel Range Searching

Data partitioning and declustering have been extensively used in the past to parallelize I/O for range queries. Numerous declustering and disk allocation techniques have been proposed in the literature. However, most of these techniques were primarily designed for two-dimensional data and for balanced partitioning of the data space. As databases increasingly integrate multimedia information in ...

Publication date: 1994